1. Identity statement
Reference Type: Conference Paper (Conference Proceedings)
Site: sibgrapi.sid.inpe.br
Holder Code: ibi 8JMKD3MGPEW34M/46T9EHH
Identifier: 8JMKD3MGPAW/3S4ELD8
Repository: sid.inpe.br/sibgrapi/2018/10.24.00.46
Last Update: 2018:10.24.00.46.42 (UTC) felipe.duarte@itau-unibanco.com.br
Metadata Repository: sid.inpe.br/sibgrapi/2018/10.24.00.46.42
Metadata Last Update: 2022:05.18.22.18.35 (UTC) administrator
Citation Key: KuboNazAguOliDua:2018:UsUNPr
Title: The usage of U-Net for pre-processing document images
Format: On-line
Year: 2018
Access Date: 2024, May 01
Number of Files: 1
Size: 906 KiB
2. Context
Author:
  1 Kubo, Diandra Akemi
  2 Nazare, Tiago Santana de
  3 Aguirre, Priscila Louise Ribeiro
  4 Oliveira, Bruno Domingues
  5 Duarte, Felipe Simões Lage Gomes
Affiliation:
  1 Data Science Team - Itau Unibanco
  2 Data Science Team - Itau Unibanco
  3 Data Science Team - Itau Unibanco
  4 Data Science Team - Itau Unibanco
  5 Data Science Team - Itau Unibanco
Editor:
  Ross, Arun
  Gastal, Eduardo S. L.
  Jorge, Joaquim A.
  Queiroz, Ricardo L. de
  Minetto, Rodrigo
  Sarkar, Sudeep
  Papa, João Paulo
  Oliveira, Manuel M.
  Arbeláez, Pablo
  Mery, Domingo
  Oliveira, Maria Cristina Ferreira de
  Spina, Thiago Vallin
  Mendes, Caroline Mazetto
  Costa, Henrique Sérgio Gutierrez
  Mejail, Marta Estela
  Geus, Klaus de
  Scheer, Sergio
e-Mail Address: felipe.duarte@itau-unibanco.com.br
Conference Name: Conference on Graphics, Patterns and Images, 31 (SIBGRAPI)
Conference Location: Foz do Iguaçu, PR, Brazil
Date: 29 Oct.-1 Nov. 2018
Publisher: Sociedade Brasileira de Computação
Publisher City: Porto Alegre
Book Title: Proceedings
Tertiary Type: Industry Application Paper
History (UTC):
  2018-10-24 00:46:42 :: felipe.duarte@itau-unibanco.com.br -> administrator ::
  2022-05-18 22:18:35 :: administrator -> :: 2018
3. Content and structure
Is the master or a copy? is the master
Content Stage: completed
Transferable: 1
Keywords: #deep-learning #computer-vision #image-processing
Abstract: When processing documents in real-world scenarios, it is common to deal with artifacts that may hamper document analysis, such as stamps, noise and strange backgrounds. Aiming to mitigate these problems, we propose the use of U-Net, a very successful biomedical image segmentation network, for handwritten and machine text segmentation. In order to do so, we trained a model for each type of text. One of the main advantages presented is that the models are trained on artificial data, avoiding the wearisome task of data labeling. For the machine text segmentation model, we test its impacts on both word and character recognition when combined with the Tesseract OCR model. For the handwritten segmentation model, we present qualitative results. Initial experiments indicate that both models are able to improve results in their respective applications.
Arrangement: urlib.net > SDLA > Fonds > SIBGRAPI 2018 > The usage of...
doc Directory Content: access
source Directory Content: there are no files
agreement Directory Content:
  agreement.html 23/10/2018 21:46 1.2 KiB
4. Conditions of access and use
data URL: http://urlib.net/ibi/8JMKD3MGPAW/3S4ELD8
zipped data URL: http://urlib.net/zip/8JMKD3MGPAW/3S4ELD8
Language: en
Target File: sibgrapi_pi_cv.pdf
User Group: felipe.duarte@itau-unibanco.com.br
Visibility: shown
Update Permission: not transferred
5. Allied materials
Mirror Repository: sid.inpe.br/banon/2001/03.30.15.38.24
Next Higher Units: 8JMKD3MGPAW/3RPADUS
Citing Item List: sid.inpe.br/sibgrapi/2018/09.03.20.37 8
Host Collection: sid.inpe.br/banon/2001/03.30.15.38
6. Notes
Empty Fields: archivingpolicy archivist area callnumber contenttype copyholder copyright creatorhistory descriptionlevel dissemination doi edition electronicmailaddress group isbn issn label lineage mark nextedition notes numberofvolumes orcid organization pages parameterlist parentrepositories previousedition previouslowerunit progress project readergroup readpermission resumeid rightsholder schedulinginformation secondarydate secondarykey secondarymark secondarytype serieseditor session shorttitle sponsor subject tertiarymark type url versiontype volume

